I.  INTRODUCTION


In  experimentation,  test  data  are  collected  on  one  or  more  variables  of 
interest.  The  objective  of  the  experiment  usually  is  to  determine  how  at 
least  one  other  variable  or  factor  is  associated  with  or  affects  another 
variable.  For  example,  striking  velocity  may  be  observed  for  a  number  of 
projectiles.  Associated  with  this  velocity  may  be  other  factors  such  as 
propellant  weight,  projectile  weight,  or  propellant  temperature.  For  this 
example, let's assume that we have measured and recorded the different
propellant weights associated with the striking velocities. A graph of the data
is given in Figure 1.

In looking at the data, there seems to be a linear relationship
between striking velocity and propellant weight. A common question is,
"Can these data be represented by some functional relationship?" In this
era of sophisticated computers and calculators, everyone has easy access to
some type of standard regression program that determines the linear
relationship between two or more variables. A regression of velocity (V) on
propellant weight (WT) was run on the data plotted in Figure 1. The following
equation was obtained:

$\hat{V}_i = 4648.56 - 10.58\,(WT_i)$  (1)

This equation was obtained by the method of least squares. This means that the
sum of squares of vertical deviations of observations from the fitted line defined
by (1) is smaller than the corresponding sum of squares of deviations from
any other line. These deviations from the fitted regression line are commonly
referred to as residuals. For this example, they are represented by the
mathematical expression residual $= V_i - \hat{V}_i$, where $\hat{V}_i$ is
obtained by Equation 1.
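
As a rough illustration of the least squares fit and its residuals, the following Python sketch fits a straight line to the 20 observations listed in Figure B-2 of Appendix B. The use of NumPy and all variable names are assumptions of this sketch; the report's own computations were carried out with FORTRAN programs.

```python
import numpy as np

# Propellant weight (WT) and striking velocity (V), copied from Figure B-2.
wt = np.repeat(np.arange(115, 125) / 10.0, 2)   # 11.5, 11.5, 11.6, ..., 12.4
v = np.array([4628.4, 4430.1, 4773.7, 4459.7, 4493.1, 4466.8, 4499.0,
              4469.2, 4470.3, 4482.0, 4514.0, 4503.5, 4502.8, 4520.7,
              4565.9, 4539.4, 4575.4, 4551.2, 4576.5, 4419.8])

# Design matrix with a constant column, mirroring the layout of the data file.
X = np.column_stack([np.ones_like(wt), wt])

# Ordinary least squares: minimize the sum of squared vertical deviations.
beta, *_ = np.linalg.lstsq(X, v, rcond=None)
intercept, slope = beta

# Residuals are observed minus fitted velocities.
residuals = v - X @ beta

print(f"V_hat = {intercept:.2f} + ({slope:.2f}) * WT")
```

The printed coefficients can be compared directly with Equation 1.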

However, in analyzing the data, we notice from Figure 2 that three
observations identified as a, b, and c stand out from the majority of the
data, and we suspect that these observations may have unduly influenced the
least squares fit. One method of determining the strength of this linear fit
is to look at the Coefficient of Determination ($R^2$). This statistic
indicates the proportion of variation in the observations (the $V_i$'s) that
is explained by
the linear regression line [1]. For this example, only 0.16 percent of the total
variability is explained by Equation 1. We feel that this is not a good
fit and that most of the data can be better represented by another line. So
we decided to exclude the points a, b, and c and compute another regression
line. The following equation was obtained:

$\hat{V}_i^* = 2685.22 + 152.19\,(WT_i^*)$  (2)

On examining this line (see Figure 3), it was found that it represents
the data much better, except for the three excluded points. After excluding
the three outliers, 87 percent of the variation was explained ($R^2 = .87$).

However, most analysts are hesitant to throw out observations, especially
without good reason. Also, if you start throwing points out, how do you
decide when to stop?
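
A minimal sketch of how the coefficient of determination changes when the three suspect points are dropped is given below. It assumes that points a, b, and c are observations 1, 3, and 20 of the Figure B-2 data (the three observations that receive low robust weights later in the report); the helper name `fit_line` is hypothetical.

```python
import numpy as np

wt = np.repeat(np.arange(115, 125) / 10.0, 2)   # propellant weights, Figure B-2
v = np.array([4628.4, 4430.1, 4773.7, 4459.7, 4493.1, 4466.8, 4499.0,
              4469.2, 4470.3, 4482.0, 4514.0, 4503.5, 4502.8, 4520.7,
              4565.9, 4539.4, 4575.4, 4551.2, 4576.5, 4419.8])

def fit_line(x, y):
    """Return (intercept, slope, r_squared) of the least squares line."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)                   # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())     # total variation
    return beta[0], beta[1], 1.0 - ss_res / ss_tot

# Assumed positions of points a, b, c (zero-based indices 0, 2, 19).
keep = np.ones(len(v), dtype=bool)
keep[[0, 2, 19]] = False

full = fit_line(wt, v)
trimmed = fit_line(wt[keep], v[keep])

print("all 20 points      : R^2 = %.4f" % full[2])
print("outliers excluded  : V = %.2f + %.2f WT, R^2 = %.2f" % trimmed)
```

Running both fits side by side makes the dependence of $R^2$ on a handful of observations explicit, which is the motivation for the robust procedure introduced next.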

[1] W. Dixon, F. J. Massey, Introduction to Statistical Analysis, McGraw-Hill,
Inc., 1969, P. 328.

Figure 1.  Striking Velocity vs. Propellant Weight

Figure 2.  Least Squares Regression of Striking Velocity vs. Propellant Weight

Figure 3.  Least Squares Regression of Propellant Weight vs. Striking Velocity with Outliers Removed


One solution to this problem is to amend the usual least squares
procedure by transforming the original data so that distorted observations will
not have undue influence on the fitted least squares regression line. This
leads us to the topic of the paper, robust regression using Iteratively
Reweighted Least Squares (IRLS). A computer package, the Robust Statistical
Estimation Package (ROSEPACK), which computes solutions to the iteratively
reweighted least squares problem and is available on the Ballistic Research
Laboratory's CDC computer, is discussed.


II.  ROBUST REGRESSION

A well known statistician by the name of George E. P. Box coined the
term "robustness" [2], where robust techniques are defined to be techniques that
are insensitive to moderate departures from the underlying assumptions. The
assumptions for least squares are that the dependent variable (for this example,
striking velocity) is normally distributed and has a constant variance
($\sigma^2$) for any given value of the independent variable (i.e., propellant
weight).

Departures from the assumption of normality and/or homogeneity of
variance may distort the least squares fit in representing the true linear
relationship that is demonstrated by the bulk of the data. The most common
of these departures is attributed to what is generally referred to as an
"outlier" or "high leverage point." There are many definitions of outliers
and many techniques for identifying them. For the purpose of this paper,
let's define an outlier as an observation that has undue influence on the
fitted regression line. One way of limiting the effects of these outliers
is to use a robust regression technique called Iteratively Reweighted
Least Squares.

A.  Iteratively  Reweighted  Least  Squares 

In simple terms, Iteratively Reweighted Least Squares (IRLS) is a
reweighted least squares problem where the weights are functions of the
scaled residuals. The basic idea is to transform the observations by
multiplying them by values of a weighting function so that they appear to
satisfy the usual assumptions. Then, one uses normal least squares techniques
on these weighted observations. The weighting values are not fixed but are
found as part of an iterative process. There are many different weighting
functions, but all of them exhibit the following property: small weighting
values are assigned to observations that have large residuals associated
with them. The solution to the problem is obtained by:

1)  determining an initial fit of the data,

2)  selecting a weight function,

3)  calculating the scaled residuals,

4)  calculating the weighting values based on the scaled residuals,

5)  determining a least squares regression fit based on the weighted
observations,

6)  testing to see if this fit differs from the previous fit.

Steps 3 through 6 are repeated, with the scaled residuals recomputed from
the latest fit each time, until the fitted regression line converges. Further
theoretical discussion on robust regression estimates using Iteratively
Reweighted Least Squares is outlined in Appendix A.

[2] R. V. Hogg, "Statistical Robustness: One View of its Use for Application
Today," The American Statistician, August 1979, V33, P. 108-115.

[3] P. Holland, R. Welsch, "Robust Regression Using Iteratively Reweighted
Least Squares," Communications in Statistics: Theory and Methods, 1977.
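
The steps listed above can be sketched in a few lines of Python. This is an illustrative version only, not ROSEPACK: the ordinary least squares start, the Cauchy weight with tuning constant 2.385, the scale estimate (median absolute residual divided by 0.6745), and the convergence test are all assumptions of the sketch, as are the function names `cauchy_weight` and `irls`. The data are the 20 observations of Figure B-2.

```python
import numpy as np

def cauchy_weight(u, c=2.385):
    """Cauchy weight: close to 1 for small |u|, close to 0 for large |u|."""
    return 1.0 / (1.0 + (u / c) ** 2)

def irls(X, y, weight=cauchy_weight, max_iter=200, tol=1e-8):
    """Iteratively reweighted least squares started from ordinary least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)       # step 1: initial fit
    w = np.ones(len(y))
    for _ in range(max_iter):
        resid = y - X @ beta
        # Assumed robust scale: median absolute residual / 0.6745.
        scale = np.median(np.abs(resid)) / 0.6745
        if scale == 0.0:
            break
        w = weight(resid / scale)                      # steps 3 and 4
        sw = np.sqrt(w)
        # Step 5: least squares on the weighted observations.
        new_beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        # Step 6: compare with the previous fit.
        done = np.max(np.abs(new_beta - beta)) <= tol * (1.0 + np.max(np.abs(beta)))
        beta = new_beta
        if done:
            break
    return beta, w   # w holds the weights used for the final fit

wt = np.repeat(np.arange(115, 125) / 10.0, 2)
v = np.array([4628.4, 4430.1, 4773.7, 4459.7, 4493.1, 4466.8, 4499.0,
              4469.2, 4470.3, 4482.0, 4514.0, 4503.5, 4502.8, 4520.7,
              4565.9, 4539.4, 4575.4, 4551.2, 4576.5, 4419.8])
X = np.column_stack([np.ones_like(wt), wt])

beta, w = irls(X, v)
print("intercept = %.1f, slope = %.2f" % (beta[0], beta[1]))
print("three lowest-weight observations:", sorted(np.argsort(w)[:3] + 1))
```

Because the scale estimate and stopping rule here are assumed rather than taken from ROSEPACK, the coefficients should be expected to resemble, not reproduce, the values reported for the package.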

B.  Weighting  Function 

Generally speaking, one of eight weight functions [3] is commonly used as
part of the iterative process (see Table 1). The default tuning constant
associated with each function provides 95 percent asymptotic efficiency of
the estimated regression coefficients with respect to ordinary least squares
when the errors are normally distributed. A graph of the values of each
weight function vs. the scaled residuals is given in Figure 4. Weight
values of one indicate ordinary least squares. The Fair weight function
deviates the fastest from normal least squares and, in fact, was derived to
approximate least absolute residual regression. On the other hand,
the Welsch weight function provides estimates that resemble the normal least
squares process. The Talwar weight function behaves like normal linear
regression except that observations which have large standardized residuals
are excluded from the analysis. All the other weight functions provide
estimates that lie between these extremes.
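
For readers without Table 1 at hand, the sketch below writes the eight weight functions in the forms commonly attributed to Holland and Welsch, each as a function of the scaled residual `u` and tuning constant `c`. These forms and the 95 percent efficiency constants in the dictionary are assumptions of the sketch (they are the values usually quoted in the robust regression literature) and may differ in detail from the ROSEPACK code.

```python
import numpy as np

def andrews(u, c):
    # sin(u/c) / (u/c) inside |u| <= pi*c, zero outside; np.sinc handles u = 0.
    return np.where(np.abs(u) <= np.pi * c, np.sinc(u / (np.pi * c)), 0.0)

def bisquare(u, c):
    return np.where(np.abs(u) <= c, (1.0 - (u / c) ** 2) ** 2, 0.0)

def cauchy(u, c):
    return 1.0 / (1.0 + (u / c) ** 2)

def fair(u, c):
    return 1.0 / (1.0 + np.abs(u) / c)

def huber(u, c):
    # 1 for |u| <= c, c/|u| beyond that.
    return 1.0 / np.maximum(1.0, np.abs(u) / c)

def logistic(u, c):
    x = np.where(u == 0, 1.0, u / c)          # placeholder avoids 0/0
    return np.where(u == 0, 1.0, np.tanh(x) / x)

def talwar(u, c):
    # Full weight inside the cutoff, observation dropped outside it.
    return np.where(np.abs(u) <= c, 1.0, 0.0)

def welsch(u, c):
    return np.exp(-((u / c) ** 2))

# (function, assumed default tuning constant)
WEIGHTS = {
    "Andrews": (andrews, 1.339), "Bisquare": (bisquare, 4.685),
    "Cauchy": (cauchy, 2.385), "Fair": (fair, 1.400),
    "Huber": (huber, 1.345), "Logistic": (logistic, 1.205),
    "Talwar": (talwar, 2.795), "Welsch": (welsch, 2.985),
}

u = np.array([0.0, 1.0, 3.0, 6.0])            # example scaled residuals
for name, (func, c) in WEIGHTS.items():
    print(f"{name:9s}", np.round(func(u, c), 3))
```

Every function returns a weight of one at a scaled residual of zero, which corresponds to ordinary least squares, and a smaller weight as the scaled residual grows.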

Going back to the previous data, an Iteratively Reweighted Least Squares
regression was performed using the Cauchy weighting function. The following
equation was obtained:

$\hat{V}_i = 2887.1 + 135.33\,(WT_i)$  (3)

This regression line represents the linear trend of the data (Figure 5)
without having to disregard any of the observations. The equation was
obtained by giving the observations the following weights.

TABLE 1.  WEIGHT FUNCTIONS

[Figure 4.  Weight function values vs. scaled residual values]

Figure 5.  Robust Regression of Striking Velocity vs. Propellant Weight


Observation     Weight Value

     1          .2689253E+00
     2          .9679182E+00
     3          .1609051E+00
     4          .9986705E+00
     5          .9165155E+00
     6          .9973460E+00
     7          .9607954E+00
     8          .9607727E+00
     9          .8837554E+00
    10          .9570672E+00
    11          .9985002E+00
    12          .9891447E+00
    13          .9205748E+00
    14          .9970075E+00
    15          .8814305E+00
    16          .9997461E+00
    17          .9092763E+00
    18          .9999391E+00
    19          .9772802E+00
    20          .3343834E+00


All of the observations have weight values close to one except for the three
outliers. These high leverage points have large absolute residuals associated
with them. Consequently, the Cauchy weight values for these three points are
less than 0.4. In essence, the influence of these observations on the
regression line has been reduced without having to exclude them from the sample.
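
The low-weight observations can be picked out mechanically from the listed weight values. The short Python sketch below does so; the 0.4 cutoff is the one quoted in the text, while the variable names are assumptions of the sketch.

```python
# Weight values for observations 1 through 20, copied from the listing above.
weights = [0.2689253, 0.9679182, 0.1609051, 0.9986705, 0.9165155,
           0.9973460, 0.9607954, 0.9607727, 0.8837554, 0.9570672,
           0.9985002, 0.9891447, 0.9205748, 0.9970075, 0.8814305,
           0.9997461, 0.9092763, 0.9999391, 0.9772802, 0.3343834]

CUTOFF = 0.4  # threshold quoted in the text

# Observation numbers are one-based, as in the report.
low = [i + 1 for i, w in enumerate(weights) if w < CUTOFF]
print(low)  # prints [1, 3, 20]
```

A listing of this kind is a convenient starting point for the reexamination of the data recommended in the Summary.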


III.  ROBUST  STATISTICAL  ESTIMATION  PACKAGE  (ROSEPACK) 

The Massachusetts Institute of Technology (M.I.T.) developed a Robust
Statistical Estimation Package (ROSEPACK) [4], which the Ballistic Research
Laboratory (BRL) obtained and installed on the CDC computer. The package
contains internal documentation, a primer which explains the control statements,
an interactive driver (IRLSDR), and computational tools for computing solutions
to the Iteratively Reweighted Least Squares problem. These include an option
for choosing an initial estimate, a default convergence level, a robust
estimate for the scale of the residuals, and the eight weight functions that were
previously stated. In addition, a host of optional statistical output is
available. The package can be used interactively or in batch mode. Further
details, including the job stream used to run our example, are provided in
Appendix B.

[4] D. Coleman, P. Holland, N. Kaden, V. Klema, S.C. Peters, "A System of
Subroutines for Iteratively Reweighted Least Squares Computation,"
Massachusetts Institute of Technology (MIT), December 1977.


IV.  SUMMARY 

We have illustrated the effects that outliers can have on normal least
squares estimates and have presented robust regression using ROSEPACK as a
method of reducing their influence on the regression fit. By no means do we
suggest that robust regression should replace normal least squares, but we
believe it should be incorporated as a part of the analysis in most least
squares problems [2]. Each analysis should include both least squares regression
and robust regression using ROSEPACK. If the estimates obtained from these
two procedures are similar, then the inferences drawn from least squares
should be correct. If, however, the estimates do not agree, one should
reexamine the data, paying particular attention to those observations with
low weights from the robust fit. This approach will force the analyst to
examine the data and ensure that the functional relationship being fitted
represents the bulk of the data.

Another  problem  that  occurs  in  regression  analyses  is  high  correlations 
between  the  various  predictors  or  independent  variables  in  the  model.  This 
problem  could  cause  unstable  parameter  estimates.  Although  it  is  not  addressed 
in  this  paper,  it  will  be  addressed  in  a  future  work  on  biased  regression 
estimation. 
APPENDIX A.  ITERATIVELY REWEIGHTED LEAST SQUARES

Iteratively Reweighted Least Squares [3, A1, A2] (IRLS) is a computational
procedure to produce robust linear regression estimates. It is simply a reweighted
least squares problem where the weights are functions of the scaled residuals.
Consider the standard regression model

$Y = X\beta + \epsilon$  (A-1)

where $Y$ is the $n$ vector of observations, $X$ is the $n \times q$ matrix of
independent variables, $\beta$ is the $q$ vector of coefficients, and $\epsilon$
is the $n$ vector of random errors.

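A robust estimate of $\beta$, $\hat{\beta}$, minimizes

$\sum_{i=1}^{n} \rho\left( (Y_i - X_i\beta)/S \right)$  (A-2)

with respect to $\beta$, where $\rho$ is a robust loss function, $X_i$ is the
$i$th row of $X$, and $S$ is a robust estimate of scale for the residuals
$Y_i - X_i\beta$. Since $\rho$ is a robust loss function, the solution to the
fit is a minimization problem. Setting the first partial derivatives with
respect to the elements of $\beta$ ($\beta_j$) equal to zero is equivalent to
finding the solution of the $q$ equations

$\sum_{i=1}^{n} x_{ij}\, \rho'\left( \frac{Y_i - X_i\hat{\beta}}{S} \right) = 0$  for $j = 1, 2, \ldots, q$

since

$\frac{\partial\, \rho\left( (Y_i - X_i\beta)/S \right)}{\partial \beta_j} = \frac{d\, \rho\left( (Y_i - X_i\beta)/S \right)}{d\left( (Y_i - X_i\beta)/S \right)} \times \frac{\partial \left( (Y_i - X_i\beta)/S \right)}{\partial \beta_j} = \rho'\left( \frac{Y_i - X_i\beta}{S} \right) \left( -\frac{x_{ij}}{S} \right)$.

The solution to this problem is obtained by taking Gauss-Newton iterations:

$\hat{\beta}_{k+1} = \hat{\beta}_k + S\,(X^T P''_k X)^{-1} X^T P'_k$  (A-3)

where $P'_k = \left[ \rho'\left( \frac{Y_1 - X_1\hat{\beta}_k}{S} \right), \ldots, \rho'\left( \frac{Y_n - X_n\hat{\beta}_k}{S} \right) \right]^T$; $X^T$ is the transpose of $X$;
and $P''_k$ is an $n \times n$ diagonal matrix whose diagonal elements are the
second derivatives $\rho''\left( (Y_i - X_i\hat{\beta}_k)/S \right)$.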

[A1] A. E. Beaton, J. W. Tukey, "The Fitting of Power Series, Meaning Polynomials,
Illustrated on Band-Spectroscopic Data," Technometrics, V16, P. 147-185.

[A2] L. Denby, W. A. Larsen, "Robust Regression," Communications in Statistics:
Theory and Methods, A6(4), 1977, P. 335-362.
To avoid the computational expense of evaluating $P''_k$, approximations are
often made. ROSEPACK approximates $\rho''$ as

$\rho''(x) \approx \frac{\rho'(x)}{x}$.

The weighting function is defined as

$w_i\left( \frac{Y_i - X_i\hat{\beta}_k}{S} \right) = \frac{\rho'\left( (Y_i - X_i\hat{\beta}_k)/S \right)}{(Y_i - X_i\hat{\beta}_k)/S}$  (A-4)

Substituting back into equation (A-3) we obtain

$\hat{\beta}_{k+1} = \hat{\beta}_k + S\,(X^T W_k X)^{-1} X^T W_k \left( \frac{Y - X\hat{\beta}_k}{S} \right)$

where

$W_k = \mathrm{diag}\left[ w_1\left( \frac{Y_1 - X_1\hat{\beta}_k}{S} \right), \ldots, w_n\left( \frac{Y_n - X_n\hat{\beta}_k}{S} \right) \right]$.

This simplifies to

$\hat{\beta}_{k+1} = (X^T W_k X)^{-1} X^T W_k Y$.

Since the usual least squares estimate is

$\hat{\beta} = (X^T X)^{-1} X^T Y$,

at each iteration we are solving the weighted least squares problem:

$W_k^{1/2}\, Y = W_k^{1/2}\, X \hat{\beta}_{k+1}$.

The iteration process requires a start value for $\beta$, $\hat{\beta}_0$, which
can be obtained from ordinary least squares or from least absolute residual
regression. Holland and Welsch [3] suggest using the least absolute residual
estimator as a "good starting value" and median absolute deviation as a robust
estimate of scale. The iteration process is continued until convergence
is obtained.
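
As a worked instance of the weight definition, consider the Cauchy weight function used in the body of the report. The loss function written below is the form usually associated with the Cauchy weight; it is stated here as an assumption rather than taken from the ROSEPACK source, and $c$ denotes the tuning constant.

```latex
\rho(u) = \frac{c^{2}}{2}\,\ln\!\left(1 + (u/c)^{2}\right), \qquad
\rho'(u) = \frac{u}{1 + (u/c)^{2}}, \qquad
w(u) = \frac{\rho'(u)}{u} = \frac{1}{1 + (u/c)^{2}}
```

With $c = 2.385$, a scaled residual of $u = c$ receives a weight of exactly one half, and the weight decreases toward zero as $|u|$ grows, so observations with large residuals contribute little to the weighted least squares fit at the next iteration.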
APPENDIX B.  ROSEPACK


ROSEPACK  (Robust  Statistical  Estimation  Package)  is  a  system  of  portable 
FORTRAN  programs  and  an  interactive  driver  to  solve  iteratively  reweighted 
least  squares  problems.  The  interactive  driver  (IRLSDR) ,  through  a  set  of 
control  statements  that  can  be  set  up  as  a  batch  job,  governs  the  options 
available  to  solve  the  weighted  least  squares  problem. 

To  facilitate  the  understanding  of  ROSEPACK,  the  example  given  in  this 
report  will  be  explained  in  detail.  This  job  is  listed  in  Figure  B-l  and 
was  set  up  on  the  BRL/CYBER  170  computer  using  SYSTEM  DATA  of  the  SENATOR 
editor.  The  first  nine  lines  are  the  job  control  cards  needed  to  attach 
ROSEPACK  and  access  the  data  file.  These  statements  are  explained  below 
and  must  be  submitted  in  the  following  order: 

100  NAME. 

The job name of this run. Can be any alphanumeric name consisting of
up to fourteen characters.

110  USER  (XXXX,  XXXX) 

CYBER  user  identification.  Your  user  name  and  password  must  be  supplied 
in  the  parentheses. 

120  CHARGE  (XXXXXX,  XX) 

CYBER  charge  card.  Your  account  number  must  be  supplied  in  the  parentheses. 

130  GET  (ROSE  =  ROSESGO) 

Attaches  executable  source  ROSESGO  through  ROSE  to  run  ROSEPACK. 

140  GET  (RL  =  ROSELIB) 

Attaches  ROSEPACK  library  of  subroutines  through  RL. 

150  LIBRARY,  RL. 

Loads  ROSEPACK  library. 

160  GET (TAPE10 = DATA)

This statement copies the data file named DATA to TAPE10 for access by
ROSEPACK. Any data file can be used by simply replacing DATA with the name
of your permanent data file. Of course, this assumes your file is stored
on MFA of the CYBER system. The data file should be in fixed column format
with the values of the dependent variable, or response, in the last
column. The first line of the data file must contain the number of
observations, the number of independent variables, and the acceptable rank
tolerance level of the data matrix. A matrix whose determinant is less than
the rank tolerance level is considered to be singular.

100  NAME.
110  USER(XXXX,XXXX)
120  CHARGE(XXXXX,XX)
130  GET(ROSE=ROSESGO)
140  GET(RL=ROSELIB)
150  LIBRARY,RL.
160  GET(TAPE10=DATA)
170  ROSE(PL=20000)
180  *EOR
190  PRCO
200  0
210  PRCO
220  1
230  PRCO
240  2
250  PRCO
260  3
270  PRCO
280  5
290  PRCO
300  6
310  PRCO
320  7
330  PRCO
340  4
350  STEP
360  03
370  MODE
380  1
390  IHMA
400  1
410  PRIN
420  STAR
430  2
440  IWGT
450  02
460  TUNI
470  2.38500000
480  STAT
490  3
500  STEM
510  1
520  MAXI
530  20
540  CONV
550  1
560  ALGO
570  1
580  OPTI
590  ITER
600  QUIT

Figure B-1.  Sample Program.

A rank tolerance of -1.0E0 was used for the example. The format of this
line is 2I5, E16.7. The data file used for this report is listed in
Figure B-2.
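
To make the fixed column layout concrete, the following Python sketch builds a header record in a 2I5, E16.7 style layout and splits one data record laid out as F3.1, F4.1, X, F6.1 (the format that appears in Figure B-3). The function names are hypothetical, and Python's exponent notation only approximates what a FORTRAN E16.7 edit descriptor would write; the column widths are the point of the example.

```python
def format_header(n_obs, n_vars, rank_tol):
    """Header record: two 5-column integers followed by a 16-column real."""
    return f"{n_obs:5d}{n_vars:5d}{rank_tol:16.7E}"

def parse_header(line):
    """Read back the three header fields by column position."""
    return int(line[0:5]), int(line[5:10]), float(line[10:26])

def parse_record(line):
    """Split one data record laid out as F3.1, F4.1, X, F6.1."""
    constant = float(line[0:3])     # column of ones for the intercept
    weight = float(line[3:7])       # propellant weight
    velocity = float(line[8:14])    # striking velocity (response, last column)
    return constant, weight, velocity

header = format_header(20, 2, -1.0)
print(repr(header))
print(parse_header(header))
print(parse_record("1.011.5 4628.4"))
```

Note that the first two fields of a data record run together ("1.011.5") because the F3.1 and F4.1 fields are adjacent with no separating blank.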

The  ROSEPACK  subroutine,  MATRDR,  which  reads  the  input  data  file, 
excluding  the  first  line,  must  be  modified  to  comply  with  the  format  of  the 
input  data.  A  short  program  named  ROS,  which  is  available  as  a  public  file 
on  the  front  end  of  the  BRL/CYBER  system,  was  written  to  make  this  change. 
The  program  is  listed  in  Figure  B-3.  In  using  this  program  to  make  the 
necessary  format  change,  one  should  do  the  following  before  submitting  the 
ROSEPACK  program: 

1)  Attach  ROS  in  IAF  mode 
GET (ROS/UN=PUBLIC) 

2)  Enter  Senator 
SENATOR 

3)  Attach  ROS  in  Senator  mode 
OLD, /ROS 

4)  Retype lines 110 and 120 of ROS. Supply your user name, password,
and account number. Be sure you retype the line number.

5)  Retype line 230 of ROS to comply with the format of the
dependent and independent variables. Use standard FORTRAN
format statements. If additional lines are needed, use
line 231, etc.

6)  List,  and  check  for  errors 
LIST 

7)  Submit  program 
SUBMIT 

8)  Exit  Senator  mode 
END 

170  ROSE (PL=2000)

This statement executes ROSEPACK and allows for a maximum of 2000 pages.

180  *EOR

Signifies the end of the job control cards and the beginning of the
control statements that govern the options available to solve the weighted
least squares problem and the optional output information.

190  PRCO 

200  0 

100     20    2          -1.0E0
110  1.011.5 4628.4
120  1.011.5 4430.1
130  1.011.6 4773.7
140  1.011.6 4459.7
150  1.011.7 4493.1
160  1.011.7 4466.8
170  1.011.8 4499.0
180  1.011.8 4469.2
190  1.011.9 4470.3
200  1.011.9 4482.0
210  1.012.0 4514.0
220  1.012.0 4503.5
230  1.012.1 4502.8
240  1.012.1 4520.7
250  1.012.2 4565.9
260  1.012.2 4539.4
270  1.012.3 4575.4
280  1.012.3 4551.2
290  1.012.4 4576.5
300  1.012.4 4419.8

Figure B-2.  Data.

100  ROSE.
110  USER(XXX,XXXX)
120  CHARGE(XXXX,XX)
130  GET(OLDPL=ROSEPL)
140  UPDATE(Q)
150  FTN(I=COMPILE)
160  REPLACE(LGO=ROSESGO)
170  *EOR
180  *IDENT JOCK03
190  *COMPILE IRLSDR,MATOA,MATOB,MATRDR
200  *INSERT MATRDR.117
210        M=MSAVE
220  *DELETE MATRDR.125
230  9002  FORMAT(F3.1,F4.1,X,F6.1)
240  *DELETE MATOA.60
250        CALL MATRDR(MM, NN, M, N, A, SIGDIG)
260  *DELETE MATOB.60
270        CALL MATRDR(MM, NN, M, N, B, SIGDIG)

Figure B-3.  ROS Format Update.

The PRCO command with its associated number controls the print control
vector that governs the optional output. The command should be one of the
first ROSEPACK control statements specified and should be repeated for
each output option desired, followed by its appropriate option number. The
PRCO command followed by 0 acts as an on-off switch. The first time these
two statements are encountered, the print vector is turned on. The second
time they are read, the print vector is turned off. The remaining
print vector options are summarized in Table B-1.

350  STEP 

360  03 

The STEP command followed by a positive integer specifies the number of
iterations between the printing of intermediate results, which must be
requested by the MODE command.

370  MODE 

380  1 

Used  to  request  printing  of  intermediate  results.  Unless  specified,  there 
will  be  no  printing  of  intermediate  results. 

390  IHMA 

400  1 

Specifies  the  computing  and  printing  of  the  diagonal  of  the  hat  matrix. 

410  PRINT 

Allows  printing  of  the  desired  output  specified  by  the  print  options. 

420  STAR 
430  2 

Allows the user to choose the type of start used to calculate initial estimates
of the regression coefficients. There are three starts available to the
user that are specified by option number 1, 2, or 3. They are summarized
below:

1)  Least Absolute Residual Start

2)  Least Squares Start using singular value decomposition

3)  Least Squares Start using the Householder algorithm (QR)

440  IWGT 
450  02 

TABLE B-1.  PRCO PRINT OPTIONS

CONTROL
STATEMENT     COMMAND

PRCO
   0          Initiates print vector for following settings
   1          Sets solution vector
   2          Sets residuals, weighted diagonal elements of hat matrix
   3          Sets convergence level
   4          Sets original data matrix
   5          Sets singular values of singular value decomposition
   6          Sets the alpha associated with QR decomposition
   7          Sets upper triangular matrix of QR decomposition

The IWGT command and associated option number allow the user to select one
of eight weight functions or a user defined weight function. The different
options for this command are summarized in Table B-2.

460  TUNI 

470  2.38500000 

The  TUNI  command  allows  the  user  to  specify  a  tuning  constant  different 
from  the  default  tuning  constant  associated  with  each  weight  function.  The 
default  tuning  constants  are  given  in  Table  B-2. 

480  STAT 

490  3 

Specifies  the  printing  of  various  statistics  outlined  in  Table  B-3. 

500  STEM 
510  1 

The  STEM  command  with  option  number  1  specifies  a  Stem  and  Leaf  representation 
of  the  independent  variable. 

520  MAXI

530  20 

Specifies  the  maximum  number  of  iterations  per  'ITER'  command  if  convergence 
is  not  reached. 

540  CONV 

550  1 

The  CONV  command  allows  convergence  checking  after  each  iteration. 

560  ALGO 
570  1 

The computational algorithm used during each iteration is specified by this
command. A zero specifies the Householder algorithm. A one specifies the
Singular Value Decomposition algorithm.

580  OPTI 

Prints  list  of  options  in  effect. 

590  ITER 

TABLE B-2.  WEIGHT FUNCTIONS AND ASSOCIATED TUNING CONSTANT

COMMAND     WEIGHT                 DEFAULT TUNING CONSTANT

IWGT
   0        Andrews                1.339
   1        Bi Square              4.685
   2        Cauchy                 2.385
   3        Fair                   1.400
   4        Huber                  1.345
   5        Logistic               1.205
   6        Talwar                 2.795
   7        Welsch                 2.985
   8        User Supplied          -
   9        Previously Defined     -

TABLE B-3.  STATISTIC OUTPUT OPTIONS GOVERNED BY STAT COMMAND

Option Number     Statistic

      0           Number of Observations
                  Number of Variables
                  Sum of Weights
                  Condition Number of Data Matrix
                  Maximum Residual
                  Minimum Weight
                  Maximum Diagonal Entry of $X (X^T X)^{-1} X^T$

      1           All of Option 0
                  Sum of Square Residuals
                  Weighted Sum of Square Residuals
                  Sum of Absolute Residuals

      2           All of Options 0 and 1
                  R-Square
                  Weighted R-Square
                  Standard Error
                  Weighted Standard Error

      3           All of Options 0, 1, and 2
                  F Statistic
                  Weighted F Statistic

Initiates the Iteratively Reweighted Least Squares process, which runs until
convergence or until the maximum number of iterations is reached. May be
repeated with the same or different options.

600  QUIT 

Exit  from  ROSEPACK 

Summarizing, we have demonstrated how to run ROSEPACK on the BRL/CYBER
system. Our sample run performed Iteratively Reweighted Least Squares
using the Cauchy weight function with its default tuning constant. A maximum
of 20 iterations has been requested, with intermediate results being
printed out after every third step. All of the print options and statistics
have been requested.
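
When several runs with different options are needed, the control statements can be generated rather than typed. The Python sketch below is one hedged way to do this: it only emits the command names and values that appear in lines 350 through 600 of Figure B-1, and the helper name `control_deck` is hypothetical and not part of ROSEPACK.

```python
def control_deck(settings):
    """Return control statements, one per line, from (command, value) pairs.

    A value of None means the command takes no option line (e.g. ITER, QUIT).
    """
    lines = []
    for command, value in settings:
        lines.append(command)
        if value is not None:
            lines.append(str(value))
    return lines

# Options of the sample run, in the order used in Figure B-1.
sample = [
    ("STEP", "03"), ("MODE", 1), ("IHMA", 1), ("PRIN", None),
    ("STAR", 2), ("IWGT", "02"), ("TUNI", "2.38500000"), ("STAT", 3),
    ("STEM", 1), ("MAXI", 20), ("CONV", 1), ("ALGO", 1),
    ("OPTI", None), ("ITER", None), ("QUIT", None),
]

for line in control_deck(sample):
    print(line)
```

Changing the weight function for another run then amounts to editing the IWGT and TUNI entries of the list.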